The Heartbreak of Violated Assumptions (Healing a Broken Regression)


The assumptions of the general linear model Y = XB + E are all about the random term E: the errors, which are estimated by the residuals, are assumed to be independent, normally distributed, and of constant variance (that is, variance that does not depend on the level of the response or predictor variables). To draw valid conclusions from your model, including interpreting p-values and predictions, these assumptions need to be checked. This workshop teaches you how to check them and what to do when they are violated.

Assumption checking involves both graphical and numeric procedures. Residual plots are useful for examining the normality of residuals; the variance of the residuals as a function of both the predictors and predicted response; and unexpected behavior over time, predictor values, or predicted values of the response.
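As a rough illustration of the numeric side of assumption checking, the sketch below fits a one-predictor regression to simulated data and computes two simple diagnostics from the residuals: the correlation between fitted values and absolute residuals (a crude signal of non-constant variance) and the Durbin-Watson statistic (a standard check for lag-1 autocorrelation). The data, the helper names (fit_line, pearson, durbin_watson), and the choice of these two diagnostics are assumptions made for the example, not material taken from the workshop.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    return cov / math.sqrt(va * vb)


def durbin_watson(resid):
    """Durbin-Watson statistic; always lies in [0, 4], and values near 2
    suggest little lag-1 autocorrelation."""
    num = sum((resid[i] - resid[i - 1]) ** 2 for i in range(1, len(resid)))
    den = sum(r * r for r in resid)
    return num / den


# Simulated (hypothetical) data: the error standard deviation grows with x,
# so the constant-variance assumption is violated by construction.
rng = random.Random(0)
x = [rng.uniform(1, 100) for _ in range(200)]
y = [2.0 + 3.0 * xi + rng.gauss(0, 0.5 * xi) for xi in x]

intercept, slope = fit_line(x, y)
fitted = [intercept + slope * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]

# A clearly positive value means the residual spread grows with the fitted value.
spread_corr = pearson(fitted, [abs(r) for r in resid])
dw = durbin_watson(resid)

print(f"corr(fitted, |residual|) = {spread_corr:.2f}")
print(f"Durbin-Watson            = {dw:.2f}")
```

Numbers like these complement residual plots rather than replace them: a plot of residuals against fitted values would show the same widening spread as a funnel shape.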

The linear model can be altered to account for violations of the normality and constant variance assumptions, either by transforming the predictors and the response or by fitting a generalized linear model. Time trends in the residuals can sometimes be accounted for by using time series models or by adding blocking variables to the model. Some predictive models and nonparametric methods avoid these assumptions altogether and could serve as alternatives, but linear regression models are the focus of this workshop.
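To make the idea of a variance-stabilizing transformation concrete, the sketch below simulates a response with multiplicative errors, so that its spread grows with the predictor, and compares a crude non-constant-variance indicator before and after taking logs of both variables. The simulated model, the helper name abs_resid_vs_fitted, and the choice of a log-log transformation are assumptions made for illustration; the right transformation for real data depends on how the assumptions are violated.

```python
import math
import random


def abs_resid_vs_fitted(x, y):
    """Fit y = a + b*x by least squares and return the Pearson correlation
    between the fitted values and the absolute residuals."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    intercept = my - slope * mx
    fitted = [intercept + slope * xi for xi in x]
    spread = [abs(yi - fi) for yi, fi in zip(y, fitted)]
    mf, ms = sum(fitted) / n, sum(spread) / n
    cov = sum((f - mf) * (s - ms) for f, s in zip(fitted, spread))
    var_f = sum((f - mf) ** 2 for f in fitted)
    var_s = sum((s - ms) ** 2 for s in spread)
    return cov / math.sqrt(var_f * var_s)


# Simulated (hypothetical) data with multiplicative errors:
# y = 2 * x * exp(e), so the spread of y grows in proportion to x.
rng = random.Random(1)
x = [rng.uniform(1, 100) for _ in range(200)]
y = [2.0 * xi * math.exp(rng.gauss(0, 0.4)) for xi in x]

# Raw scale: the residual spread increases with the fitted value.
raw_corr = abs_resid_vs_fitted(x, y)

# Log-log scale: log(y) = log(2) + log(x) + e, which has constant error variance.
log_x = [math.log(xi) for xi in x]
log_y = [math.log(yi) for yi in y]
log_corr = abs_resid_vs_fitted(log_x, log_y)

print(f"raw scale:     corr(fitted, |residual|) = {raw_corr:.2f}")
print(f"log-log scale: corr(fitted, |residual|) = {log_corr:.2f}")
```

A value near zero on the transformed scale is consistent with constant variance, but the transformation also changes what the coefficients mean: here the slope describes a proportional rather than an additive effect.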

